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ORIGINAL ARTICLE 

Capturing Drug Responses by Quantitative Promoter 
Activity Profiling 

K Kajiyama 12 , M Okada-Hatakeyama 3 , Y Hayashizaki 24 , H Kawaji 24 and H Suzuki 1 

Quantitative analysis of cellular responses to drugs is of major interest in pharmaceutical research. Microarray technologies 
have been widely used for monitoring genome-wide expression changes. However, this approach has several limitations in 
terms of coverage of targeted RNAs, sensitivity, and quantitativeness, which are crucial for accurate monitoring of cellular 
responses. In this article, we report an application of genome-wide and quantitative profiling of cellular responses to drugs. 
We monitored promoter activities in MCF-7 cells by Cap Analysis of Gene Expression using a single-molecule sequencer. We 
identified a distinct set of promoters affected even by subtle inhibition of the Ras-ERK and phosphatidylinositol-3-kinase-Akt 
signal-transduction pathways. Furthermore, we succeeded in explaining the majority of promoter responses to inhibition of 
the upstream epidermal growth factor receptor kinase quantitatively based on the promoter profiles upon inhibition of the two 
individual downstream signaling pathways. Our results demonstrate unexplored utility of highly quantitative promoter activity 
profiling in drug research. 
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In the development of new drugs, monitoring cellular 
responses to drug candidates is a fundamental approach 
for assessing the efficacy and safety of using these drugs. A 
range of molecular profiling approaches, such as mass spec- 
trometry, antibody-based proteomics, quantitative reverse 
transcriptase-polymerase chain reaction, and DNA micro- 
array, has been used for monitoring responses at the molec- 
ular (protein and RNA) level. 1-4 For instance, quantification 
of mRNA abundance is an effective way for monitoring gene 
expression changes in response to the drugs. 5-8 Microarray- 
based technologies have been widely used for monitoring 
such changes on a genome-wide scale 9 also in a context of 
drug effects on particular signaling pathways. 10-12 However, 
microarrays have several limitations in terms of coverage of 
targeted RNAs, sensitivity, and dynamic quantitative range 
because they rely on predesigned oligonucleotide probes 
and hybridization-based detection. 13 The quantitative limi- 
tation has forced researchers to use semi-quantitative 
interpretation of gene expression changes, for example, 
by rank-based analysis. 14-18 RNA-seq is one of the latest 
techniques for profiling the transcriptome 19 ' 20 by sequenc- 
ing random fragments of RNA; here, most of the protocols 
rely on second-generation sequencers and polymerase 
chain reaction-based amplification. Cap Analysis of Gene 
Expression (CAGE) is an alternative method for quantify- 
ing the transcriptome by sequencing the 5'-end of RNAs, 21 
and transcription start site profiles based on a polymerase 
chain reaction-dependent CAGE protocol are used as a ref- 
erence of promoter activities in quantitative modeling based 
on multiple epigenetic marks in the ENCODE consortium. 22 
Recently, we improved on the CAGE method by adapting 



it to a third-generation (single-molecule) sequencer, which 
allowed us to avoid any amplification steps from the library 
preparation to the sequencing reaction, suggesting that 
the resulting read counts represent the absolute number of 
observations of RNA presence. 2324 In this article, we ask 
whether cellular responses can be modeled quantitatively 
from the aspect of the transcriptome, in particular, promoter 
activities. We demonstrate quantitative modeling based on 
accurate quantification of subtle cellular responses induced 
by low-dosage drug treatment. 

RESULTS 

Promoter activity profiling of cellular responses to drugs 

We monitored the effects of U0126, wortmannin, and gefi- 
tinib on human breast cancer MCF-7 cells using the quanti- 
tative and genome-wide promoter profiling method. U0126 
and wortmannin are specific inhibitors of the Ras-ERK 
and phosphatidylinositol-3-kinase (PI3K)-Akt pathways, 
respectively (Figure 1a). Gefitinib is a potent inhibitor 
of the epidermal growth factor receptor (EGFR) kinase 
and mainly inhibits the Ras-ERK and PI3K-Akt pathways 
located downstream of this receptor. After determination of 
dosage of these drugs that show significant but not sat- 
urating effects on the cells (see Supplementary Figure 
S1 online), we prepared three replicate samples followed 
by CAGE profiling. On average, we obtained -14 million 
reads mapped on the reference genome per sample. By 
aggregation of neighboring transcription start sites (see 
Supplementary Methods online for detailed parameters 
and thresholds), we defined 10,298 promoters with char- 
acteristics consistent with those in a previous research 23 
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Figure 1 Promoter-based expression profiling of drugs targeting the epidermal growth factor receptor (EGFR) pathway, (a) Schematic 
representation of EGFR pathways and the drug-targeting sites. Ligand-mediated dimerization of the EGFR induces autophosphorylation 
of the EGFR tyrosine kinase, which activates downstream signal-transduction pathways, mainly Ras-ERK and phosphatidylinositol- 
3-kinase (PI3K)-Akt, which regulate gene expression. Gefitinib, U0126, and wortmannin directly inhibit the activity of EGFR, ERK, 
and Akt, respectively. The activation status of EGFR and Ras-ERK and PI3K-Akt pathways were monitored by measurement of the 
phosphorylation status of EGFR, ERK, and Akt, respectively (marked by orange circles), (b) A scatter plot of promoter activity (tags 
per million) between two biological replicates of gefitinib-treated MCF-7 cells, (c) Comparison of U0126- and wortmannin-treatment 
profiles (left panel showing promoter activities monitored by CAGE and right showing gene expressions monitored by microarray). The 
log2 fold change of each promoter activity (gene expression) against non-drug treatment control is plotted. The promoters (genes) 
significantly affected by either U01 26 or wortmannin treatment are color coded as blue and green, respectively, and by both treatments 
as red. False-discovery rate (FDR) <2% for Cap Analysis of Gene Expression (CAGE) and B-statistics >4 for microarray were used 
to select an almost equal number of the affected promoters/genes. EGR1, FOS, and CCND1 are known to be regulated by the Ras- 
ERK pathway and ErbB3 by the PI3K-Akt pathway, (d) Percentage of promoters significantly affected by either U0126 or wortmannin 
treatments or both. 



(see Supplementary Figures S2-S4 and Table S1 

online). Of note, even when we treated with drugs at 
low concentrations, promoter activities across triplicate 
samples were highly reproducible (average of three drug 
samples and standard deviation of Pearson's correlation 
coefficient = 0.9984 ±0.001 6; a scatter plot of the biologi- 
cal replicates is shown in Figure 1b and Supplementary 
Figure S5 online). By differential analysis comparing with 
no drug treatment, we identified 139, 168, and 157 promot- 
ers significantly affected by U0126, wortmannin, and gefi- 
tinib treatment, respectively (false-discovery rate <2%; see 
Supplementary Table S2 online). Although drug treatment 



of the cells induces changes in gene expression, the mag- 
nitude of such changes is, in general, not very extraor- 
dinary because fundamentally they are of the same cell 
type. 5 ' 6 ' 9 U0126 significantly suppressed EGR1 and FOS, 
target genes of the Ras-ERK pathway, 25 wortmannin upreg- 
ulated ErbB3, which was previously reported to be upregu- 
lated in a PI3K-Akt and Fox03a-dependent manner, 26 and 
gefitinib suppressed CCND1 promoter activity, one of the 
target genes of the EGFR pathway. 27 Notably, FOS was 
not detected as a differentially expressed gene following 
U0126 treatment as by microarray profiling of the same 
RNA samples (Figure 1c). Taken together, the promoter 
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activity profiles that we generated in this study are highly 
reproducible across biological replicates, consistent with 
previous studies, and could quantify modest changes in the 
transcriptome at submaximal drug dosage. 

Quantitative comparison of drug responses 

We compared U0126- and wortmannin-treated profiles. 
Since they inhibit parallel but distinct pathways, one would 
expect distinct cellular responses. Figure 1c indicates that 
the promoter activities emphasize their differences (Pear- 
son's correlation coefficient -0.60), which are more evident 
than in gene expression profiles determined by microarray 
(Pearson's correlation coefficient ~ 0.92). Among the pro- 
moters significantly affected by the drugs, -15% of them are 
commonly altered (i.e., suppressed or activated), whereas 
-84% are affected by only a single drug (Figure 1d). Inter- 
estingly, a promoter, designated as chr7_-_74867620 (unan- 
notated promoter), was affected in opposite ways (see 
Supplementary Table S2 online), even though microarray 
analysis did not detect significant expression changes at our 
working drug dosage. Overall, the promoter activity-based 
cellular response profiling approach successfully identified 
a distinct set of promoters downstream of individual path- 
ways (-80% of the promoters), as well as a set of commonly 
affected promoters. 

Quantitative explanation of the EGFR inhibitory effect by 
downstream inhibitors 

We asked whether the profile for gefitinib could be 
explained by using the profiles for U0126 and wortmannin. 
As EGFR (a gefitinib target) primarily controls both Ras- 
ERK (target of U0126) and PI3K-Akt (target of wortman- 
nin) pathways (Figure 1a), we hypothesized that EGFR 
inhibition can be explained by individual inhibition of these 
two pathways at the transcriptome level. To test this hypoth- 
esis, we used multiple linear regression analysis by tak- 
ing gefitinib profiles as response variables and U0126 and 
wortmannin profiles as explanatory variables, where only 
promoters significantly affected by gefitinib are considered. 
Pearson's correlation coefficient of the regression model 
to the gefitinib profile, 0.749, is higher than the correla- 
tions of either U0126 or wortmannin profiles to the gefi- 
tinib profile (0.668 and 0.659, respectively; Figure 2a and 
Supplementary Figure S6 online). One could speculate 
that the correlation 0.749 can be achieved by combinatorial 
regression of any profiles, and thus we tested this possi- 
bility by performing the same regression analysis 100,000 
times with random permutations of explanatory variables, 
that is, either U0126 or wortmannin profiles (see Supple- 
mentary Method 5.2 online). The correlation coefficients 
of the result of permutations (green and blue histograms, 
respectively; see Supplementary Figure S6 online) did 
not become higher than the one of the regression model 
based on U0126 and wortmannin with the gefitinib pro- 
files, which indicates P value for the occurrence of such 
high correlation coefficient is <1 x10 -5 . This result cannot 
be obtained with microarray analysis because the number 
of genes whose expression changes at equivalent thresh- 
olds are reliably measured is too small at an equivalent 
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Figure 2 Linear regression model explaining the gefitinib treatment 
profile by using U0126 and wortmannin profiles, (a) Promoter activity 
profiles in gefitinib treatment and the regression model using U0126 
and wortmannin profiles. The promoters in which the regression 
model is consistent with the gefitinib profile (within 0.2-fold) are color 
coded as black, (b) Content of the promoters affected by gefitinib. 



threshold (see Supplementary Figure S7 online). Even 
if we take a different threshold, by considering the result 
that the microarray profiles of U0126 and wortmannin 
responses are very close each other as described earlier 
(correlation coefficients 0.92; Figure 1c), their combination 
will remain very close to their original profiles rather than to 
the gefitinib response. 

We focused on the promoters that are roughly modeled by 
using both U0126 and wortmannin profiles. Of the 157 pro- 
moters significantly affected by gefitinib treatment, the level 
of expression changes in 133 (84.7%) promoters is consis- 
tent with the regression model (the difference of log2 fold 
changes in the gefitinib profile to control and the regression 
model are within 0.2-fold as shown in Figure 2a, b and Sup- 
plementary Table S3 online for individual classifications). 
The remaining 24 promoters (15.3%) can be interpreted 
by postulating the existence of minor pathways affected 
by gefitinib only. For example, one of them may be a STAT 
pathway. It has been reported that two transcription factors, 
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STAT1 and STAT2, are perturbed by gefitinib in a manner 
independent of the Ras-ERK and PI3K-Akt pathways. 28 
Two of their target genes identified by ChIP experiments, 29 
TNFR and KRT23, are included in the 24 promoters (Fig- 
ure 2a). When focusing on the 133 promoters in which 
the regression model is consistent with the gefitinib pro- 
file, 50 (37.6%) are altered either by U0126 or wortman- 
nin, and 72 (54.1%) are unaffected by either treatment (see 
Supplementary Table S3 online and Figure 2b). This find- 
ing indicates that both U0126 and wortmannin profiles con- 
tribute independently to the regression model for majority of 
the promoters and that even subtle changes identified here 
reflect the actual perturbation at the transcriptome level. 

DISCUSSION 

This is the first report on promoter activity profiling of drug 
responses using nonamplified CAGE libraries followed 
by single-molecule sequencing. Promoter activity profiles 
in response to the inhibition of ERK and PI3K kinases by 
U0126 and wortmannin, respectively, in human breast 
cancer MCF-7 cells are highly reproducible and consis- 
tent with previous studies. The sensitive profiling achieved 
by single-molecule sequencing allowed us to distinguish 
modest quantitative changes in the transcriptome even at 
submaximal drug dosage, which is very difficult to achieve 
with microarray-based profiling. Furthermore, the quanti- 
tative profiling allowed us to propose that both the inhibi- 
tory profiles of U0126 and wortmannin were constitutive 
components of the transcriptome profiles obtained by 
inhibition of the EGFR kinase, which is located upstream 
of both ERK and PI3K pathways. Although there are still 
some unexplained effects when using those two profiles as 
explanatory variables, our regression model successfully 
explained the changes in majority promoters significantly 
altered by gefitinib (Figure 2a, b and Supplementary Table 
S3 online), and its statistical significance was confirmed by 
permutation test (see Supplementary Figure S6 online). 
Lange et al. explained protein phosphorylation in the same 
pathway by nonlinear model, which could explain the unex- 
plained effect by our linear model presented here comple- 
mentarily. 30 Nevertheless, the results here demonstrated 
the utility of sensitive and quantitative promoter activity 
profiling in elucidation of cellular responses to the drug. 

Quantitative transcriptome analysis is potentially widely 
applicable in determining target proteins and action 
mechanisms of uncharacterized compounds. To further 
verify the applicability of this approach, additional analysis 
should be performed for different kinds of drugs with known 
action mechanisms, for example, herceptin (anti-ErbB2 
receptor antibody), tamoxifen (estrogen receptor antago- 
nist), cisplatin (DNA synthesis inhibitor), 5-fluorouracil, 
methotorexate (metabolome inhibitor), or docetaxicel 
(microtubule assembly inhibitor), which have been widely 
used for cancer therapies. Nonetheless, this study paves 
the way for quantitative analysis of drug responses at the 
promoter level, and moreover, it is potentially applicable for 
the evaluation of combinatorial or serial drug treatment in 
a clinical setting. 



METHODS 

Cell culture and RNA preparation. The MCF-7 human breast 
cancer cell line was obtained from the American Type Culture 
Collection and maintained in Dulbecco's modified Eagle's 
medium (Invitrogen, Carlsbad, CA) supplemented with 10% 
fetal bovine serum. Before drug treatment, the medium was 
switched to Dulbecco's modified Eagle's medium supple- 
mented with 2% fetal bovine serum and the cultures were 
maintained overnight. Gefitinib (ZD1839) 31 (1 uM, a gener- 
ous gift from AstraZeneca, London, UK), U0126 32 (500 nM, 
Calbiochem, San Diego, CA), or wortmannin 33 (10nM, Naca- 
lai Tesque, Kyoto, Japan) was added and incubated with the 
cells for 6h. Cells treated with dimethyl sulfoxide were used 
as the no drug treatment control. After the treatment, the cells 
were washed with phosphate-buffered saline twice, and RNA 
was isolated using the miRNeasy Mini kit (QIAGEN, Hilden, 
Germany) and analyzed by Bioanalyzer (Agilent technology, 
Santa Clara, CA). 

Western blot. We confirmed the drug effect on the target pro- 
teins at the phosphoprotein level and defined the optimal drug 
concentrations so as to avoid a saturating effect. Cell lysates 
clarified by centrifugation were used in western blotting as 
described previously. 34 ERK1/2, phospho-ERK1/2(T202/ 
Y204), Akt, phospho-Akt(S473), phospho-EGFR(Y1068) 
antibodies were purchased from Cell Signaling Technology 
(Danvers, MA), and anti-EGFR antibody was purchased from 
Fitzgerald Industries International (Acton, MA). 

Production and analysis of single-molecule CAGE data. 
CAGE libraries for single-molecule sequencing were con- 
structed as described previously 23 and sequenced on Heli- 
Scope. Sequenced reads were filtered and mapped to the 
February 2009 human genome assembly in UCSC (hg19 
assembly). We grouped mapped reads overlapping within a 
few base pairs on the same strand into a single cluster as a 
promoter, and quantified promoter activities by counting the 
number of integrated reads in each cluster. Differential expres- 
sion analysis between the control and the drug-treated samples 
was performed by using the edgeR package of Bioconductor 
in the R statistical language. 35 False-discovery rate <2% was 
used as a criterion for the differential expression against con- 
trol experiments (see Supplementary Methods online). 

Microarray analysis. We performed microarray analysis for 
establishing a baseline of the drug effects caused by our 
working doses, which was used for comparison with the 
drug effects of single-molecule CAGE profiles. Five hun- 
dred nanograms of total RNA was subjected to the Sentrix 
Human-6 Expression BeadChip (lllumina) according to the 
standard lllumina protocols. Variance-stabilizing normaliza- 
tion of lllumina data and B-statistics calculations were car- 
ried out using the lumi and limma packages of Bioconductor 
in the R statistical language. 36-38 B-statistics >4 was used as 
criterion for the differential expression against control (see 
Supplementary Methods online). 

Linear regression model of the effect of EGFR inhibition. 
As a result of the differential analysis of gefitinib treatment 
against the control condition, we identified 157 promoters 
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that were significantly altered. We then modeled the expres- 
sion changes of these altered promoters by gefitinib treat- 
ment based on U0126- and wortmannin-response profiles. 
We defined the fold change of promoter activities by drug 
treatment (F ) as follows: 

F p,s= lo g 2 ( e P,s /E p)' 

where E p represents the expression of promoter p on the 
control and e ps represents the expression in sample s. We 
used a linear regression analysis to model the response of 
promoter activities after gefitinib treatment (F pgefitinib ). The for- 
mula of the linear regression is defined with error variable e p 
as follows: 



p.gefitinib " n p,U01 26 + ^p,wortmannin + e P' 



and optimized the parameters and p 2 ) by using the least 
squares approach, 0.7326 and 0.7391 , respectively. 

All of these analyses totally depend on how to quantify pro- 
moter expressions, while we constructed promoters opera- 
tionally as described in Supplementary Method 3.1 online. 
One could hypothesize that the regression produces totally 
different results when we used different definitions of pro- 
moter. To examine this possibility, we estimated parameters 
and p 2 ) with several definitions of promoter and found that 
the parameter values estimated above are representative, 
that is, within 2.5 percentile (0.5690 and 0.5930 for p : and 
p 2 ) and 97.5 percentile (0.7892 and 0.8713, respectively) in 
the promoter sets (see Supplementary Method 5.1 online 
for details). 
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Study Highlights 

WHAT IS THE CURRENT KNOWLEDGE ON THE 
TOPIC? 

Cap analysis of gene expression (CAGE) was 
used extensively in FANTOM and ENCODE 
projects to globally map transcription start 
sites, quantify these activities, and elucidate 
regulatory network of gene expression during 
cell differentiation. Although this method has 
great potential to advance pharmacometric 
researches, there are currently no examples 
where it has been successfully adapted to this 
type of analysis. 

WHAT QUESTION DID THIS STUDY ADDRESS? 

y Can quantification of transcriptome dynamics 
achieved by drug treatment be useful for evalu- 
ating signal combinatorial drug effects? 



WHAT THIS STUDY ADDS TO OUR KNOWLEDGE 

S Owing to its highly quantitative nature, promoter 
activity profiling using the nonamplified CAGE 
followed by single sequencing can be used to 
describe drug effects and to discriminate effects 
of different drugs even at submaximal dosages. 



HOW THIS MIGHT CHANGE CLINICAL 
PHARMACOLOGY AND THERAPEUTICS? 

y This method allows elucidation of transcriptome 
regulatory networks achieved by the drug treat- 
ment and is applicable for determining target 
biological pathways and action mechanisms of 
uncharacterized compounds. 
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